Overview

Dataset statistics

Number of variables: 14
Number of observations: 145460
Missing cells: 4032
Missing cells (%): 0.2%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 16.6 MiB
Average record size in memory: 120.0 B
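
The derived figures in this table can be cross-checked from the raw counts. The sketch below is not part of the original report. It assumes that the missing-cell percentage is taken over all cells (variables times observations) and that 1 MiB is 1,048,576 bytes; the variable names are illustrative. Because the total size is reported rounded to 16.6 MiB, the recomputed record size lands near, not exactly at, 120.0 B.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 14
n_observations = 145460
missing_cells = 4032
total_size_mib = 16.6  # already rounded in the report

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")            # Missing cells: 0.2%
print(f"Average record size: {avg_record_bytes:.1f} B")  # close to the reported 120.0 B
```

Note that 4032 missing cells is exactly 14 × 288, which matches the 288 missing values reported for each variable.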

Variable types

Categorical: 4
Numeric: 8
Boolean: 2

Alerts

Dataset has 2 (< 0.1%) duplicate rows [Duplicates]
MinTemp is highly correlated with MaxTemp [High correlation]
MaxTemp is highly correlated with MinTemp [High correlation]
Location is highly correlated with MinTemp and 6 other fields [High correlation]
MinTemp is highly correlated with Location and 1 other field [High correlation]
MaxTemp is highly correlated with Location and 1 other field [High correlation]
WindGustDir is highly correlated with Location and 2 other fields [High correlation]
WindDir9am is highly correlated with Location and 2 other fields [High correlation]
WindDir3pm is highly correlated with Location and 2 other fields [High correlation]
NewHumidity is highly correlated with Location and 1 other field [High correlation]
NewTemp is highly correlated with Location and 1 other field [High correlation]
Rainfall has 92924 (63.9%) zeros [Zeros]

Reproduction

Analysis started: 2022-02-28 08:24:11.255255
Analysis finished: 2022-02-28 08:24:52.278142
Duration: 41.02 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
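
The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library (variable names are illustrative, not taken from the report):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-02-28 08:24:11.255255")
finished = datetime.fromisoformat("2022-02-28 08:24:52.278142")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 41.02 seconds
```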

Variables

Location
Categorical

HIGH CORRELATION

Distinct: 49
Distinct (%): < 0.1%
Missing: 288
Missing (%): 0.2%
Memory size: 2.2 MiB

Canberra: 3436
Sydney: 3344
Brisbane: 3193
Darwin: 3193
Melbourne: 3193
Other values (44): 128813

Length

Max length: 16
Median length: 8
Mean length: 8.711039319
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Albury
2nd row: Albury
3rd row: Albury
4th row: Albury
5th row: Albury

Common Values

Value               Count     Frequency (%)
Canberra            3436      2.4%
Sydney              3344      2.3%
Brisbane            3193      2.2%
Darwin              3193      2.2%
Melbourne           3193      2.2%
Adelaide            3193      2.2%
Perth               3193      2.2%
Hobart              3193      2.2%
GoldCoast           3040      2.1%
Cairns              3040      2.1%
Other values (39)   113154    77.8%

Length

Histogram of lengths of the category

Value               Count     Frequency (%)
canberra            3436      2.4%
sydney              3344      2.3%
brisbane            3193      2.2%
darwin              3193      2.2%
melbourne           3193      2.2%
adelaide            3193      2.2%
perth               3193      2.2%
hobart              3193      2.2%
townsville          3040      2.1%
alicesprings        3040      2.1%
Other values (39)   113154    77.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

MinTemp
Real number (ℝ)

HIGH CORRELATION

Distinct: 389
Distinct (%): 0.3%
Missing: 288
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.21289298
Minimum: -8.5
Maximum: 33.9
Zeros: 161
Zeros (%): 0.1%
Negative: 3501
Negative (%): 2.4%
Memory size: 2.2 MiB

Quantile statistics

Minimum: -8.5
5-th percentile: 1.8
Q1: 7.6
median: 12
Q3: 16.9
95-th percentile: 23
Maximum: 33.9
Range: 42.4
Interquartile range (IQR): 9.3

Descriptive statistics

Standard deviation: 6.396455664
Coefficient of variation (CV): 0.5237461487
Kurtosis: -0.4843906492
Mean: 12.21289298
Median Absolute Deviation (MAD): 4.6
Skewness: 0.01343422194
Sum: 1772970.1
Variance: 40.91464506
Monotonicity: Not monotonic
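
Several of the MinTemp figures are derived from others: the range is the maximum minus the minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the reported values. It is an illustrative check rather than the profiler's own code, and tiny differences are expected because the inputs are already rounded.

```python
# Values copied from the MinTemp statistics.
minimum, maximum = -8.5, 33.9
q1, q3 = 7.6, 16.9
mean, std = 12.21289298, 6.396455664

value_range = maximum - minimum   # reported: 42.4
iqr = q3 - q1                     # reported: 9.3
cv = std / mean                   # reported: 0.5237461487
variance = std ** 2               # reported: 40.91464506

print(round(value_range, 1), round(iqr, 1))  # 42.4 9.3
print(round(cv, 6), round(variance, 4))
```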
Histogram with fixed size bins (bins=50)
Value                Count     Frequency (%)
17.6                 1074      0.7%
11                   906       0.6%
10.2                 901       0.6%
9.6                  900       0.6%
12                   891       0.6%
10.5                 887       0.6%
10                   878       0.6%
10.8                 876       0.6%
9                    873       0.6%
8.9                  865       0.6%
Other values (379)   136121    93.6%

Value    Count    Frequency (%)
-8.5     1        < 0.1%
-8.2     2        < 0.1%
-8       2        < 0.1%
-7.8     1        < 0.1%
-7.6     2        < 0.1%
-7.5     2        < 0.1%
-7.3     1        < 0.1%
-7.2     1        < 0.1%
-7.1     1        < 0.1%
-7       8        < 0.1%

Value    Count    Frequency (%)
33.9     1        < 0.1%
31.9     1        < 0.1%
31.8     1        < 0.1%
31.4     3        < 0.1%
31.2     1        < 0.1%
31       1        < 0.1%
30.7     2        < 0.1%
30.5     1        < 0.1%
30.3     1        < 0.1%
30.2     1        < 0.1%

MaxTemp
Real number (ℝ)

HIGH CORRELATION

Distinct: 505
Distinct (%): 0.3%
Missing: 288
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.22418717
Minimum: -4.8
Maximum: 48.1
Zeros: 15
Zeros (%): < 0.1%
Negative: 114
Negative (%): 0.1%
Memory size: 2.2 MiB

Quantile statistics

Minimum: -4.8
5-th percentile: 12.9
Q1: 18
median: 22.7
Q3: 28.2
95-th percentile: 35.5
Maximum: 48.1
Range: 52.9
Interquartile range (IQR): 10.2

Descriptive statistics

Standard deviation: 7.110371109
Coefficient of variation (CV): 0.3061623237
Kurtosis: -0.2135362797
Mean: 23.22418717
Median Absolute Deviation (MAD): 5.1
Skewness: 0.2184859211
Sum: 3371501.7
Variance: 50.55737731
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count     Frequency (%)
23.9                 1152      0.8%
20                   893       0.6%
19.8                 846       0.6%
19                   846       0.6%
20.4                 834       0.6%
19.9                 827       0.6%
20.8                 823       0.6%
18.5                 817       0.6%
19.5                 813       0.6%
21                   812       0.6%
Other values (495)   136509    93.8%

Value    Count    Frequency (%)
-4.8     1        < 0.1%
-4.1     1        < 0.1%
-3.8     1        < 0.1%
-3.7     1        < 0.1%
-3.2     1        < 0.1%
-3.1     2        < 0.1%
-3       1        < 0.1%
-2.9     1        < 0.1%
-2.7     1        < 0.1%
-2.5     2        < 0.1%

Value    Count    Frequency (%)
48.1     1        < 0.1%
47.3     2        < 0.1%
47       1        < 0.1%
46.9     1        < 0.1%
46.8     3        < 0.1%
46.7     2        < 0.1%
46.6     1        < 0.1%
46.5     1        < 0.1%
46.4     4        < 0.1%
46.3     2        < 0.1%

Rainfall
Real number (ℝ≥0)

ZEROS

Distinct: 681
Distinct (%): 0.5%
Missing: 288
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.37929215
Minimum: 0
Maximum: 371
Zeros: 92924
Zeros (%): 63.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0.8
95-th percentile: 13.2
Maximum: 371
Range: 371
Interquartile range (IQR): 0.8

Descriptive statistics

Standard deviation: 8.485728267
Coefficient of variation (CV): 3.566492777
Kurtosis: 175.0343015
Mean: 2.37929215
Median Absolute Deviation (MAD): 0
Skewness: 9.734401759
Sum: 345406.6
Variance: 72.00758421
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
0                    92924    63.9%
0.2                  8904     6.1%
0.4                  3834     2.6%
0.6                  2632     1.8%
0.8                  2107     1.4%
1                    1781     1.2%
1.2                  1570     1.1%
1.4                  1404     1.0%
1.6                  1217     0.8%
1.8                  1129     0.8%
Other values (671)   27670    19.0%

Value    Count    Frequency (%)
0        92924    63.9%
0.1      162      0.1%
0.2      8904     6.1%
0.3      66       < 0.1%
0.4      3834     2.6%
0.5      39       < 0.1%
0.6      2632     1.8%
0.7      13       < 0.1%
0.8      2107     1.4%
0.9      15       < 0.1%

Value    Count    Frequency (%)
371      1        < 0.1%
367.6    1        < 0.1%
278.4    1        < 0.1%
268.6    1        < 0.1%
247.2    1        < 0.1%
240      1        < 0.1%
236.8    1        < 0.1%
225      1        < 0.1%
219.6    1        < 0.1%
216.3    1        < 0.1%

WindGustDir
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): < 0.1%
Missing: 288
Missing (%): 0.2%
Memory size: 2.2 MiB

W: 17417
SE: 9613
N: 9421
SSE: 9366
E: 9342
Other values (11): 90013

Length

Max length: 3
Median length: 2
Mean length: 2.136562147
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WSW
2nd row: NE
3rd row: W
4th row: WNW
5th row: W

Common Values

Value              Count    Frequency (%)
W                  17417    12.0%
SE                 9613     6.6%
N                  9421     6.5%
SSE                9366     6.4%
E                  9342     6.4%
S                  9307     6.4%
WSW                9212     6.3%
SW                 9182     6.3%
SSW                8842     6.1%
WNW                8424     5.8%
Other values (6)   45046    31.0%

Length

Histogram of lengths of the category

Value              Count    Frequency (%)
w                  17417    12.0%
se                 9613     6.6%
n                  9421     6.5%
sse                9366     6.5%
e                  9342     6.4%
s                  9307     6.4%
wsw                9212     6.3%
sw                 9182     6.3%
ssw                8842     6.1%
wnw                8424     5.8%
Other values (6)   45046    31.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

WindGustSpeed
Real number (ℝ≥0)

Distinct: 67
Distinct (%): < 0.1%
Missing: 288
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.46457306
Minimum: 6
Maximum: 135
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 6
5-th percentile: 19
Q1: 30
median: 37
Q3: 46
95-th percentile: 65
Maximum: 135
Range: 129
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 13.62464802
Coefficient of variation (CV): 0.3452374362
Kurtosis: 1.427839318
Mean: 39.46457306
Median Absolute Deviation (MAD): 7
Skewness: 0.8808702838
Sum: 5729151
Variance: 185.6310337
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
33                  11115    7.6%
35                  9632     6.6%
37                  9154     6.3%
39                  8917     6.1%
31                  8565     5.9%
41                  7504     5.2%
30                  7168     4.9%
43                  6864     4.7%
28                  6597     4.5%
44                  5739     3.9%
Other values (57)   63917    43.9%

Value    Count    Frequency (%)
6        1        < 0.1%
7        19       < 0.1%
9        92       0.1%
11       196      0.1%
13       534      0.4%
15       841      0.6%
17       1406     1.0%
19       4716     3.2%
20       2667     1.8%
22       2849     2.0%

Value    Count    Frequency (%)
135      3        < 0.1%
130      1        < 0.1%
126      2        < 0.1%
124      2        < 0.1%
122      3        < 0.1%
120      4        < 0.1%
117      4        < 0.1%
115      5        < 0.1%
113      8        < 0.1%
111      3        < 0.1%

WindDir9am
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): < 0.1%
Missing: 288
Missing (%): 0.2%
Memory size: 2.2 MiB

N: 12496
NW: 10248
SE: 10027
E: 9679
SSE: 9679
Other values (11): 93043

Length

Max length: 3
Median length: 2
Mean length: 2.180764886
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: W
2nd row: SE
3rd row: ENE
4th row: W
5th row: SW

Common Values

Value              Count    Frequency (%)
N                  12496    8.6%
NW                 10248    7.0%
SE                 10027    6.9%
E                  9679     6.7%
SSE                9679     6.7%
SW                 9317     6.4%
S                  9301     6.4%
W                  9024     6.2%
NNE                8661     6.0%
NNW                8485     5.8%
Other values (6)   48255    33.2%

Length

Histogram of lengths of the category

Value              Count    Frequency (%)
n                  12496    8.6%
nw                 10248    7.1%
se                 10027    6.9%
sse                9679     6.7%
e                  9679     6.7%
sw                 9317     6.4%
s                  9301     6.4%
w                  9024     6.2%
nne                8661     6.0%
nnw                8485     5.8%
Other values (6)   48255    33.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

WindDir3pm
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): < 0.1%
Missing: 288
Missing (%): 0.2%
Memory size: 2.2 MiB

SE: 11928
W: 10268
SW: 10178
S: 10003
WSW: 9629
Other values (11): 93166

Length

Max length: 3
Median length: 2
Mean length: 2.207484914
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WSW
2nd row: E
3rd row: NW
4th row: W
5th row: W

Common Values

Value              Count    Frequency (%)
SE                 11928    8.2%
W                  10268    7.1%
SW                 10178    7.0%
S                  10003    6.9%
WSW                9629     6.6%
SSE                9485     6.5%
WNW                9469     6.5%
N                  9004     6.2%
NW                 8847     6.1%
ESE                8574     5.9%
Other values (6)   47787    32.9%

Length

Histogram of lengths of the category

Value              Count    Frequency (%)
se                 11928    8.2%
w                  10268    7.1%
sw                 10178    7.0%
s                  10003    6.9%
wsw                9629     6.6%
sse                9485     6.5%
wnw                9469     6.5%
n                  9004     6.2%
nw                 8847     6.1%
ese                8574     5.9%
Other values (6)   47787    32.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

RainToday
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 288
Missing (%): 0.2%
Memory size: 1.4 MiB

Value       Count     Frequency (%)
False       112477    77.3%
True        32695     22.5%
(Missing)   288       0.2%

RainTomorrow
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 288
Missing (%): 0.2%
Memory size: 1.4 MiB

Value       Count     Frequency (%)
False       112480    77.3%
True        32692     22.5%
(Missing)   288       0.2%

NewWindSpeed
Real number (ℝ≥0)

Distinct: 127
Distinct (%): 0.1%
Missing: 288
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.45142314
Minimum: 0
Maximum: 88
Zeros: 505
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 5.5
Q1: 11
median: 15.5
Q3: 21
95-th percentile: 30.5
Maximum: 88
Range: 88
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.750668259
Coefficient of variation (CV): 0.4711244853
Kurtosis: 1.057709798
Mean: 16.45142314
Median Absolute Deviation (MAD): 5
Skewness: 0.7196467507
Sum: 2388286
Variance: 60.07285846
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
13                   7194     4.9%
14                   6633     4.6%
12                   6311     4.3%
15                   5553     3.8%
19.5                 5402     3.7%
11                   5381     3.7%
18.5                 5353     3.7%
17.5                 4513     3.1%
16                   4475     3.1%
10                   4139     2.8%
Other values (117)   90218    62.0%

Value    Count    Frequency (%)
0        505      0.3%
1        351      0.2%
2        813      0.6%
3        1033     0.7%
3.5      854      0.6%
4        868      0.6%
4.5      1752     1.2%
5        621      0.4%
5.5      2401     1.7%
6        513      0.4%

Value    Count    Frequency (%)
88       1        < 0.1%
83       1        < 0.1%
80.5     1        < 0.1%
69.5     2        < 0.1%
69       1        < 0.1%
66.5     3        < 0.1%
65       1        < 0.1%
63       1        < 0.1%
62       1        < 0.1%
61.5     1        < 0.1%

NewHumidity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 200
Distinct (%): 0.1%
Missing: 288
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 60.17328066
Minimum: 0
Maximum: 100
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 27
Q1: 48.5
median: 61.5
Q3: 73
95-th percentile: 88.5
Maximum: 100
Range: 100
Interquartile range (IQR): 24.5

Descriptive statistics

Standard deviation: 18.26111578
Coefficient of variation (CV): 0.3034754892
Kurtosis: -0.1009034926
Mean: 60.17328066
Median Absolute Deviation (MAD): 12
Skewness: -0.3595674717
Sum: 8735475.5
Variance: 333.4683497
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count     Frequency (%)
61                   2102      1.4%
62                   1731      1.2%
66                   1665      1.1%
63.5                 1665      1.1%
64                   1662      1.1%
60.5                 1637      1.1%
58.5                 1628      1.1%
64.5                 1627      1.1%
59.5                 1621      1.1%
63                   1613      1.1%
Other values (190)   128221    88.1%

Value    Count    Frequency (%)
0        1        < 0.1%
1        197      0.1%
1.5      2        < 0.1%
2        4        < 0.1%
2.5      6        < 0.1%
3        6        < 0.1%
3.5      12       < 0.1%
4        16       < 0.1%
4.5      12       < 0.1%
5        24       < 0.1%

Value    Count    Frequency (%)
100      230      0.2%
99.5     135      0.1%
99       199      0.1%
98.5     176      0.1%
98       301      0.2%
97.5     177      0.1%
97       231      0.2%
96.5     177      0.1%
96       253      0.2%
95.5     269      0.2%

NewPressure
Real number (ℝ≥0)

Distinct: 1407
Distinct (%): 1.0%
Missing: 288
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1016.708898
Minimum: 979.75
Maximum: 1040.05
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 MiB

Quantile statistics

Minimum: 979.75
5-th percentile: 1005.6
Q1: 1012.3
median: 1016.75
Q3: 1021.05
95-th percentile: 1027.75
Maximum: 1040.05
Range: 60.3
Interquartile range (IQR): 8.75

Descriptive statistics

Standard deviation: 6.847634479
Coefficient of variation (CV): 0.006735098405
Kurtosis: 0.239444936
Mean: 1016.708898
Median Absolute Deviation (MAD): 4.4
Skewness: -0.07660944875
Sum: 147597664.2
Variance: 46.89009796
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
1017.25               3433      2.4%
1019.35               3171      2.2%
1014.65               3170      2.2%
1027.3                3034      2.1%
1014.8                1482      1.0%
1019.8                828       0.6%
1020.2                686       0.5%
1013.4                513       0.4%
1011.55               452       0.3%
1016.25               401       0.3%
Other values (1397)   128002    88.0%

Value      Count    Frequency (%)
979.75     1        < 0.1%
981.05     1        < 0.1%
982.05     1        < 0.1%
982.6      2        < 0.1%
983.05     1        < 0.1%
984        1        < 0.1%
984.2      1        < 0.1%
984.35     1        < 0.1%
984.5      1        < 0.1%
984.75     1        < 0.1%

Value      Count    Frequency (%)
1040.05    1        < 0.1%
1039.65    1        < 0.1%
1039.55    1        < 0.1%
1039.35    1        < 0.1%
1039.25    2        < 0.1%
1039.15    1        < 0.1%
1039.05    1        < 0.1%
1039       1        < 0.1%
1038.95    2        < 0.1%
1038.9     1        < 0.1%

NewTemp
Real number (ℝ)

HIGH CORRELATION

Distinct: 1437
Distinct (%): 1.0%
Missing: 288
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.3338774
Minimum: -6.3
Maximum: 41.6
Zeros: 7
Zeros (%): < 0.1%
Negative: 286
Negative (%): 0.2%
Memory size: 2.2 MiB

Quantile statistics

Minimum: -6.3
5-th percentile: 9.55
Q1: 14.55
median: 19
Q3: 23.9
95-th percentile: 30.3
Maximum: 41.6
Range: 47.9
Interquartile range (IQR): 9.35

Descriptive statistics

Standard deviation: 6.484813552
Coefficient of variation (CV): 0.3354119517
Kurtosis: -0.2798084092
Mean: 19.3338774
Median Absolute Deviation (MAD): 4.65
Skewness: 0.1245190398
Sum: 2806737.65
Variance: 42.05280681
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
21.55                 614       0.4%
17.25                 479       0.3%
16.4                  467       0.3%
16.35                 463       0.3%
17                    448       0.3%
15.75                 448       0.3%
17.5                  446       0.3%
20.5                  446       0.3%
18.85                 444       0.3%
19.75                 444       0.3%
Other values (1427)   140473    96.6%

Value    Count    Frequency (%)
-6.3     1        < 0.1%
-5.5     1        < 0.1%
-5.25    1        < 0.1%
-4.95    1        < 0.1%
-4.75    1        < 0.1%
-4.65    1        < 0.1%
-4.6     1        < 0.1%
-4.45    1        < 0.1%
-4.4     1        < 0.1%
-4.35    1        < 0.1%

Value    Count    Frequency (%)
41.6     1        < 0.1%
41.25    1        < 0.1%
41.1     1        < 0.1%
41.1     1        < 0.1%
40.9     1        < 0.1%
40.8     1        < 0.1%
40.75    2        < 0.1%
40.7     1        < 0.1%
40.5     1        < 0.1%
40.45    1        < 0.1%

Interactions

[64 pairwise interaction plots of the 8 numeric variables; images not reproduced here.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
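
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. This is an illustrative sketch, not the profiler's implementation. The helper names are hypothetical, ties are assumed to receive the average of their positions, and the example values are the MinTemp and MaxTemp columns of sample rows 2 to 9.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def correlation(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman's rho: the correlation of the rank variables."""
    return correlation(average_ranks(x), average_ranks(y))

min_temp = [12.9, 9.2, 17.5, 14.6, 14.3, 7.7, 9.7, 13.1]
max_temp = [25.7, 28.0, 32.3, 29.7, 25.0, 26.7, 31.9, 30.1]
print(round(spearman_rho(min_temp, max_temp), 4))
```

Eight rows are far too few to reproduce the report's coefficient; the point is only to show the mechanics.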

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
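
As a rough illustration of both the formula and the invariance property, the sketch below computes r directly and then again after converting one variable from Celsius to Fahrenheit (a change of location and positive scale). The function name is hypothetical and the data are the MinTemp and MaxTemp values of sample rows 2 to 9; this is not the profiler's own code.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums of squares are used.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

min_temp = [12.9, 9.2, 17.5, 14.6, 14.3, 7.7, 9.7, 13.1]
max_temp = [25.7, 28.0, 32.3, 29.7, 25.0, 26.7, 31.9, 30.1]

r = pearson_r(min_temp, max_temp)

# Shift and rescale one variable: r should be unchanged up to rounding error.
min_temp_f = [t * 9 / 5 + 32 for t in min_temp]
r_rescaled = pearson_r(min_temp_f, max_temp)

print(round(r, 6), round(r_rescaled, 6))
```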

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
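
The pair-counting definition translates almost literally into code. The sketch below implements the variant exactly as described (often called tau-a), where tied pairs count as neither concordant nor discordant. Statistical libraries commonly default to a tie-corrected variant (tau-b), so values may differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 means a tie in x or y; it counts as neither
    return (concordant - discordant) / len(pairs)

# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This direct version is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.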

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
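
A minimal sketch of a bias-corrected Cramér's V, assuming the commonly cited form of Bergsma's correction (subtracting the expected bias from φ² and shrinking the row and column counts). The function name and the toy contingency tables are hypothetical, and this is not necessarily how the profiler implements it.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Assumed correction terms: remove the expected bias of phi^2 and
    # adjust the effective number of rows and columns.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denominator) if denominator > 0 else 0.0

independent = [[10, 10], [10, 10]]   # no association
perfect = [[50, 0], [0, 50]]         # each row determines the column

print(cramers_v_corrected(independent))
print(cramers_v_corrected(perfect))
```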

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
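
One way to read "nullity correlation" is as the ordinary correlation between presence indicators (1 where a value exists, 0 where it is missing). The sketch below follows that reading. The function names are hypothetical, and the miniature columns are made up for illustration, loosely modelled on the sample rows where the original columns and the New* columns are missing at opposite ends of the table.

```python
from math import sqrt

def presence(column):
    """Map a column to 1 where a value is present and 0 where it is None."""
    return [0 if value is None else 1 for value in column]

def nullity_correlation(col_a, col_b):
    """Pearson correlation of the presence indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the correlation is undefined for a constant indicator.
    """
    a, b = presence(col_a), presence(col_b)
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)

# Illustrative miniature columns (not the real dataset).
location = [None, None, "Albury", "Albury", "Uluru", "Uluru"]
min_temp = [None, None, 12.9, 9.2, 5.2, 6.4]
new_temp = [22.10, 22.30, 23.75, 24.75, None, None]

print(nullity_correlation(location, min_temp))  # always missing together
print(nullity_correlation(location, new_temp))  # never missing together
```

A value of 1 means two columns are always missing together; negative values mean that one tends to be present where the other is missing.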

Sample

First rows

     Location  MinTemp  MaxTemp  Rainfall  WindGustDir  WindGustSpeed  WindDir9am  WindDir3pm  RainToday  RainTomorrow  NewWindSpeed  NewHumidity  NewPressure  NewTemp
0    NaN       NaN      NaN      NaN       NaN          NaN            NaN         NaN         NaN        NaN           22.5          34.0         1008.15      22.10
1    NaN       NaN      NaN      NaN       NaN          NaN            NaN         NaN         NaN        NaN           10.0          30.5         1015.20      22.30
2    Albury    12.9     25.7     0.0       WSW          46.0           W           WSW         No         No            13.5          57.5         1008.40      23.75
3    Albury    9.2      28.0     0.0       NE           24.0           SE          E           No         No            21.5          39.0         1007.30      24.75
4    Albury    17.5     32.3     1.0       W            41.0           ENE         NW          No         No            22.0          34.0         1008.90      21.35
5    Albury    14.6     29.7     0.2       WNW          56.0           W           W           No         No            11.5          33.5         1011.75      20.90
6    Albury    14.3     25.0     0.0       W            50.0           SW          W           No         No            17.5          25.5         1006.25      24.25
7    Albury    7.7      26.7     0.0       W            35.0           SSE         W           No         No            13.0          42.5         1006.35      24.15
8    Albury    9.7      31.9     0.0       NNW          80.0           SE          NW          No         Yes           11.5          35.0         1010.25      24.60
9    Albury    13.1     30.1     1.4       W            28.0           S           SSE         Yes        No            14.0          90.0         1007.35      16.45

Last rows

LocationMinTempMaxTempRainfallWindGustDirWindGustSpeedWindDir9amWindDir3pmRainTodayRainTomorrowNewWindSpeedNewHumidityNewPressureNewTemp
145450Uluru5.224.30.0E24.0SEENoNoNaNNaNNaNNaN
145451Uluru6.423.40.0ESE31.0SESENoNoNaNNaNNaNNaN
145452Uluru8.020.70.0ESE41.0SEENoNoNaNNaNNaNNaN
145453Uluru7.420.60.0E35.0ESEENoNoNaNNaNNaNNaN
145454Uluru3.521.80.0E31.0ESEENoNoNaNNaNNaNNaN
145455Uluru2.823.40.0E31.0SEENENoNoNaNNaNNaNNaN
145456Uluru3.625.30.0NNW22.0SENNoNoNaNNaNNaNNaN
145457Uluru5.426.90.0N37.0SEWNWNoNoNaNNaNNaNNaN
145458Uluru7.827.00.0SE28.0SSENNoNoNaNNaNNaNNaN
145459Uluru14.927.00.0SE28.0ESEESENoNoNaNNaNNaNNaN

Duplicate rows

Most frequently occurring

LocationMinTempMaxTempRainfallWindGustDirWindGustSpeedWindDir9amWindDir3pmRainTodayRainTomorrowNewWindSpeedNewHumidityNewPressureNewTemp# duplicates
0Melbourne17.623.90.0S31.0SSNoNo18.561.01019.8021.552
1MountGinini-0.29.20.0SW35.0NWNWNoNo16.530.01017.2514.102